{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic RNNs in Tensorflow\n",
    "- 1 Layer of 5 recurrent neurons\n",
    "- Outputs of a layer of recurrent neurons for all instances in a mini-batch\n",
    "$$ \\textbf{Y}_{(t)} = \\phi \\big(  \\textbf{X}_{(t)} . \\textbf{W}_x + \\textbf{Y}_{(t-1)}^T . \\textbf{W}_y + b \\big) $$\n",
    "$$ = \\phi \\big( \\big[ \\textbf{X}_{(t)}  \\textbf{Y}_{(t-1)}  \\big] . \\textbf{W} + b  \\big)$$\n",
    "\n",
    "where $ \\textbf{W} = {\\textbf{W}_x\\brack \\textbf{W}_y}  $\n",
    "\n",
    "#### Notations in vectorized form:\n",
    "- $m =$ number of instances in mini-batch\n",
    "- $n_{neurons}$ number of neurons\n",
    "- $n_{inputs}$  number of input features\n",
    "- $dim(x)$ a function that determines shape or dimension of element $x$.\n",
    "- $\\textbf{Y}_{(t)}$ is the layer output at time $t$ for each instance of the mini-batch:\n",
    "$$ dim(\\textbf{Y}_{(t)}) = m\\times n_{neurons}$$ \n",
    "- $\\textbf{X}_{(t)}$ is a matrix containing the inputs for all instances:\n",
    "$$ dim(\\textbf{X}_{(t)}) = m \\times n_{inputs}$$\n",
    "- $\\textbf{W}_{x}$ is a matrix containing the connections weights for the inputs of the **current** time step:\n",
    "$$ dim(\\textbf{W}_{x}) = n_{inputs} \\times n_{neurons} $$\n",
    "- $\\textbf{W}_{y}$ is a matrix containing the connections werights for the outputs of the **previous** time step:\n",
    "$$ dim(\\textbf{W}_{y}) = n_{nuerons} \\times n_{neurons} $$\n",
    "- Weight matrices $\\textbf{W}_x$ and $\\textbf{W}_y$ are often concatenated into a single matrix $\\textbf{W}$ of shape $(n_{inputs} + n_{nuerons}) \\times n_{neurons}$\n",
    "- Bias term $b$ is just a 1-dimensional vector of size $1 \\times n_{neurons}$ "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### TODO: Put image of network we are building here"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 89,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "from tensorflow.contrib.layers import fully_connected\n",
    "import os\n",
    "import numpy as np\n",
    "\n",
    "tf.set_random_seed(1) # seed to obtain similar outputs\n",
    "os.environ['CUDA_VISIBLE_DEVICES'] = '' # avoids using GPU for this session"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Single Recurrent Neuron\n",
    "![alt txt](https://docs.google.com/drawings/d/e/2PACX-1vQXBLYvvI1dqAHdLA0hQdsP1PojmCfuSCMK2DXEL0uTvRUqvD1eYK8fsECcNCoekxCbgWJ-k7QF_1s4/pub?w=703&h=508)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 133,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Implementation of RNN with Single Neuron\n",
    "N_INPUTS = 4\n",
    "N_NEURONS = 1\n",
    "\n",
    "class SingleRNN(object):\n",
    "    def __init__(self, n_inputs, n_neurons):\n",
    "        self.X0 = tf.placeholder(tf.float32, [None, n_inputs])\n",
    "        self.X1 = tf.placeholder(tf.float32, [None, n_inputs])\n",
    "\n",
    "        self.Wx = tf.Variable(tf.random_normal(shape=[n_inputs, n_neurons], dtype=tf.float32))\n",
    "        self.Wy = tf.Variable(tf.random_normal(shape=[n_neurons, n_neurons], dtype=tf.float32))\n",
    "        b = tf.Variable(tf.zeros([1, n_neurons], dtype=tf.float32))\n",
    "\n",
    "        self.Y0 = tf.tanh(tf.matmul(self.X0, self.Wx) + b)\n",
    "        self.Y1 = tf.tanh(tf.matmul(self.Y0, self.Wy) + tf.matmul(self.X1, self.Wx) + b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 135,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now we feed input at both time steps\n",
    "# Generate mini-batch with 4 instances (i.e., each instance has an input sequence of exactly two inputs)\n",
    "#                 instance1 instance2 instance3 instance4                   \n",
    "X0_batch = np.array([[0,1,2,0], [3,4,5,0], [6,7,8,0], [9,0,1,0]]) # t = 0\n",
    "X1_batch = np.array([[9,8,7,0], [0,0,0,0], [6,5,4,0], [3,2,1,0]]) # t = 1\n",
    "\n",
    "model = SingleRNN(N_INPUTS, N_NEURONS)\n",
    "with tf.Session() as sess:\n",
    "    # initialize and run all variables so that we can use their values directly\n",
    "    init = tf.global_variables_initializer()\n",
    "    sess.run(init)\n",
    "    Y0_val, Y1_val, Wx, Wy = sess.run([model.Y0, model.Y1, model.Wx, model.Wy], feed_dict={model.X0: X0_batch, model.X1: X1_batch})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 136,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.55767643]\n",
      " [ 0.7142067 ]\n",
      " [ 0.98433769]\n",
      " [ 0.99762088]]\n"
     ]
    }
   ],
   "source": [
    "print(Y0_val)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 137,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.45043385]\n",
      " [ 0.74536854]\n",
      " [-0.68741149]\n",
      " [ 0.21806668]]\n"
     ]
    }
   ],
   "source": [
    "print(Wx)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Basic Layer of Recurrent Neurons\n",
    "![alt txt](https://docs.google.com/drawings/d/e/2PACX-1vQov6BGg1fXOb7Bg5zenPh7R5j6VsZJh_D6JevQ_sm_fCxmXORxad3qLIFGG1FojzJig0qdcAQoGYoN/pub?w=643&h=404)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 87,
   "metadata": {},
   "outputs": [],
   "source": [
    "# RNN unrolled through two time steps\n",
    "N_INPUTS = 3 # number of features in input\n",
    "N_NEURONS = 5\n",
    "\n",
    "class BasicRNN(object):\n",
    "    def __init__(self, n_inputs, n_neurons):\n",
    "        self.X0 = tf.placeholder(tf.float32, [None, n_inputs])\n",
    "        self.X1 = tf.placeholder(tf.float32, [None, n_inputs])\n",
    "\n",
    "        Wx = tf.Variable(tf.random_normal(shape=[n_inputs, n_neurons], dtype=tf.float32))\n",
    "        Wy = tf.Variable(tf.random_normal(shape=[n_neurons, n_neurons], dtype=tf.float32))\n",
    "        b = tf.Variable(tf.zeros([1, n_neurons], dtype=tf.float32))\n",
    "\n",
    "        self.Y0 = tf.tanh(tf.matmul(self.X0, Wx) + b)\n",
    "        self.Y1 = tf.tanh(tf.matmul(self.Y0, Wy) + tf.matmul(self.X1, Wx) + b)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 88,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Now we feed input at both time steps\n",
    "# Generate mini-batch with 4 instances (i.e., each instance has an input sequence of exactly two inputs)\n",
    "#                 instance1 instance2 instance3 instance4                   \n",
    "X0_batch = np.array([[0,1,2], [3,4,5], [6,7,8], [9,0,1]]) # t = 0\n",
    "X1_batch = np.array([[9,8,7], [0,0,0], [6,5,4], [3,2,1]]) # t = 1\n",
    "\n",
    "model = BasicRNN(N_INPUTS, N_NEURONS)\n",
    "with tf.Session() as sess:\n",
    "    # initialize and run all variables so that we can use their values directly\n",
    "    init = tf.global_variables_initializer()\n",
    "    sess.run(init)\n",
    "    Y0_val, Y1_val = sess.run([model.Y0, model.Y1], feed_dict={model.X0: X0_batch, model.X1: X1_batch})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 85,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.82335067  0.9679544  -0.87781847  0.15444483  0.47437516]\n",
      " [-0.99492419  0.99998653 -0.99989074 -0.62147528  0.99839431]\n",
      " [-0.99986637  1.         -0.99999982 -0.92323399  0.9999963 ]\n",
      " [-0.89664513 -0.75321907  0.9998976  -0.99999988  0.99999988]]\n"
     ]
    }
   ],
   "source": [
    "print(Y0_val) # output at t = 0 with 4 X 5 dimensions (m X n_neurons)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 86,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[-0.95814323  1.         -0.99999988 -0.60029292  0.9999997 ]\n",
      " [ 0.86120242  0.41333306 -0.38317758  0.97402877 -0.15842104]\n",
      " [-0.91327202  0.99999303 -0.99999648  0.35196573  0.99996436]\n",
      " [-0.99632466  0.89853793 -0.99984962 -0.99980336  0.9999882 ]]\n"
     ]
    }
   ],
   "source": [
    "print(Y1_val) # output at t = 1"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### RNNs using Static Unrolling Through Time\n",
    "\n",
    "![alt txt](https://docs.google.com/drawings/d/e/2PACX-1vQSmyivQcisygW5O7DrkeOd-FF3nUTH0CV_vwSg3hzFdvwaKhY18JYM-e0wWL7Bif6F66LdzkaTTXUK/pub?w=929&h=343)\n",
    "\n",
    "Deals with situations when dealing with extremely large inputs and outputs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [],
   "source": [
    "class StaticRNN(object):\n",
    "    def __init__(self, n_steps, n_inputs, n_neurons):\n",
    "        self.X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])\n",
    "        X_seqs = tf.unstack(tf.transpose(self.X, perm =[1,0,2]))\n",
    "        \n",
    "        basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons) # build copies of cell for each time step (unrolling).\n",
    "        output_seqs, states = tf.contrib.rnn.static_rnn(basic_cell, X_seqs, dtype=tf.float32) # does chaining for each input with the cells\n",
    "        \n",
    "        self.outputs = tf.transpose(tf.stack(output_seqs), perm=[1,0,2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [],
   "source": [
    "# must create 3D tensor as this is what Tensorflow RNN Cell requires\n",
    "X_batch = np.array([\n",
    "     # t = 0    t = 1\n",
    "    [[0,1,2], [9,8,7]], # instance 0\n",
    "    [[3,4,5], [0,0,0]], # instance 1\n",
    "    [[6,7,8], [6,5,4]], # instance 2 \n",
    "    [[9,0,1], [3,2,1]], # instnace 3\n",
    "])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(4, 2, 3)"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "X_batch.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 57,
   "metadata": {},
   "outputs": [],
   "source": [
    "tf.reset_default_graph() # resetting graph everything we run the code to avoid TF error\n",
    "N_STEPS = 2\n",
    "model = StaticRNN(N_STEPS, N_INPUTS, N_NEURONS)\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    \n",
    "    # initialize and run all variables so that we can use their values directly\n",
    "    init = tf.global_variables_initializer()\n",
    "    sess.run(init)\n",
    "    output_vals = model.outputs.eval(feed_dict={model.X: X_batch})       "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 58,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[[-0.35060367  0.85217297  0.6135031   0.46224031 -0.80418861]\n",
      "  [-0.93800652  0.99982429  0.99781042 -0.99988985  0.99659526]]\n",
      "\n",
      " [[-0.6539489   0.99790823  0.96494722 -0.80104691 -0.61586177]\n",
      "  [-0.90182787 -0.03686395  0.22155505 -0.3443155   0.51058269]]\n",
      "\n",
      " [[-0.83310556  0.99997252  0.99734652 -0.99106479 -0.31516016]\n",
      "  [-0.96041662  0.99637145  0.9749549  -0.99955434  0.98075169]]\n",
      "\n",
      " [[-0.96983677 -0.8127926   0.91314077 -0.99944443  0.99990416]\n",
      "  [-0.34946752  0.84921968  0.38088277 -0.9891333   0.75855315]]]\n"
     ]
    }
   ],
   "source": [
    "print(output_vals)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "There is still a problem with Static RNN version because it builds one cell per time step, which can amplify if we had a very large time steps. "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Dynamic Unrolling Through Time\n",
    "Basically runs a while loop over a cell the appropriate number of times. Definitely a much cleaner and computationally effecient way of building RNNs on tensorflow. Also, no need to unstack, stack, or transpose.\n",
    "\n",
    "Here we also handle variable length input sequences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 68,
   "metadata": {},
   "outputs": [],
   "source": [
    "class DynamicRNN(object):\n",
    "    def __init__(self, n_steps, n_inputs, n_neurons):\n",
    "        # denotes the size of the sequence (helpful for supporting varying sizes of input sequences, e.g., sentences)\n",
    "        self.seq_length = tf.placeholder(tf.int32, [None]) \n",
    "        self.X = tf.placeholder(tf.float32, [None, n_steps, n_inputs]) # 3D tensor\n",
    "        \n",
    "        basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)\n",
    "        self.outputs, self.states = tf.nn.dynamic_rnn(basic_cell, self.X, dtype=tf.float32, sequence_length=self.seq_length)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 69,
   "metadata": {},
   "outputs": [],
   "source": [
    "# must create 3D tensor as this is what Tensorflow RNN Cell requires\n",
    "X_batch = np.array([\n",
    "     # t = 0    t = 1\n",
    "    [[0,1,2], [9,8,7]], # instance 0\n",
    "    [[3,4,5], [0,0,0]], # instance 1 (padded with a zero vector)\n",
    "    [[6,7,8], [6,5,4]], # instance 2 \n",
    "    [[9,0,1], [3,2,1]], # instnace 4\n",
    "])\n",
    "\n",
    "seq_length_batch = np.array([2,1,2,2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 73,
   "metadata": {},
   "outputs": [],
   "source": [
    "tf.reset_default_graph() # resetting graph everything we run the code to avoid TF error\n",
    "\n",
    "N_STEPS = 2\n",
    "model = DynamicRNN(N_STEPS, N_INPUTS, N_NEURONS)\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    \n",
    "    # initialize and run all variables so that we can use their values directly\n",
    "    init = tf.global_variables_initializer()\n",
    "    sess.run(init)\n",
    "    outputs_val, states_val = sess.run([model.outputs, model.states],\n",
    "                                       feed_dict={model.X: X_batch, model.seq_length: seq_length_batch})     "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 74,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[[-0.37425023  0.74163836 -0.83193213 -0.77348614  0.73784202]\n",
      "  [ 0.03848177 -0.35700297 -0.77217108 -0.61098522  0.99779701]]\n",
      "\n",
      " [[-0.36137825  0.78198385 -0.97801197 -0.93518633  0.98954976]\n",
      "  [ 0.          0.          0.          0.          0.        ]]\n",
      "\n",
      " [[-0.34836701  0.81669199 -0.99730986 -0.98258781  0.99963427]\n",
      "  [-0.10163038 -0.43126816  0.2514632  -0.0913566   0.92772639]]\n",
      "\n",
      " [[ 0.99900454 -0.99995887  0.85338706  0.9904601  -0.89884502]\n",
      "  [ 0.68933213 -0.73745525 -0.86696005 -0.27243653  0.93952006]]]\n"
     ]
    }
   ],
   "source": [
    "print(outputs_val) # pairs t = 0 ,  t = 1 and outputs them as multidimensional array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 75,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[ 0.03848177 -0.35700297 -0.77217108 -0.61098522  0.99779701]\n",
      " [-0.36137825  0.78198385 -0.97801197 -0.93518633  0.98954976]\n",
      " [-0.10163038 -0.43126816  0.2514632  -0.0913566   0.92772639]\n",
      " [ 0.68933213 -0.73745525 -0.86696005 -0.27243653  0.93952006]]\n"
     ]
    }
   ],
   "source": [
    "print(states_val) # notice that states contain the final state (i.e, t = 1)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Training RNN Classifier on MNIST\n",
    "\n",
    "![alt txt](https://docs.google.com/drawings/d/e/2PACX-1vSiMstqkE9hTYmhPD3KMeFRNNKYA2NnrCayahBOEL1TalRqaWF7rH8a7O-nP9c-mKOdZRsWtmAGZfNN/pub?w=969&h=368)\n",
    "\n",
    "Note that even though image classification can be done more effectively using CNN, RNNs will still perform well since the sequence is also important in the process of drawing digits."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 122,
   "metadata": {},
   "outputs": [],
   "source": [
    "class ImageRNN(object):\n",
    "    def __init__(self, n_steps, n_inputs, n_neurons, n_outputs):\n",
    "        self.X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])\n",
    "        self.y = tf.placeholder(tf.int32, [None])\n",
    "        \n",
    "        basic_cell = tf.contrib.rnn.BasicRNNCell(num_units=n_neurons)\n",
    "        outputs, states = tf.nn.dynamic_rnn(basic_cell, self.X, dtype=tf.float32)\n",
    "        \n",
    "        # computes loss\n",
    "        logits = fully_connected(states, n_outputs, activation_fn=None) # log probabilities\n",
    "        self.loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(labels=self.y, logits=logits)) \n",
    "        \n",
    "        # evaluation (accuracy)\n",
    "        correct =  tf.nn.in_top_k(logits, self.y, 1) # tf.equal(tf.argmax(logits, 1), tf.argmax(self.y, 1))\n",
    "        self.accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))        "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 123,
   "metadata": {},
   "outputs": [],
   "source": [
    "# parameters \n",
    "N_STEPS = 28\n",
    "N_INPUTS = 28\n",
    "N_NEURONS = 150\n",
    "N_OUTPUTS = 10\n",
    "N_EPHOCS = 10\n",
    "BATCH_SIZE = 150\n",
    "DISPLAY_STEP = 1\n",
    "LEARNING_RATE = 0.001"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 124,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Extracting MNIST_data/train-images-idx3-ubyte.gz\n",
      "Extracting MNIST_data/train-labels-idx1-ubyte.gz\n",
      "Extracting MNIST_data/t10k-images-idx3-ubyte.gz\n",
      "Extracting MNIST_data/t10k-labels-idx1-ubyte.gz\n"
     ]
    }
   ],
   "source": [
    "from tensorflow.examples.tutorials.mnist import input_data\n",
    "\n",
    "mnist = input_data.read_data_sets(\"MNIST_data/\")\n",
    "X_test = mnist.test.images.reshape((-1, N_STEPS, N_INPUTS))\n",
    "y_test = mnist.test.labels"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 125,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "Epoch: 0001 cost= 0.512163897\n",
      "Train Accuracy:  0.946667   Test Accuracy:  0.9265\n",
      "Epoch: 0002 cost= 0.206084174\n",
      "Train Accuracy:  0.96   Test Accuracy:  0.9477\n",
      "Epoch: 0003 cost= 0.159465483\n",
      "Train Accuracy:  0.986667   Test Accuracy:  0.959\n",
      "Epoch: 0004 cost= 0.133277364\n",
      "Train Accuracy:  0.973333   Test Accuracy:  0.9529\n",
      "Epoch: 0005 cost= 0.122922636\n",
      "Train Accuracy:  0.973333   Test Accuracy:  0.9576\n",
      "Epoch: 0006 cost= 0.106411702\n",
      "Train Accuracy:  0.966667   Test Accuracy:  0.9624\n",
      "Epoch: 0007 cost= 0.100989035\n",
      "Train Accuracy:  0.966667   Test Accuracy:  0.9702\n",
      "Epoch: 0008 cost= 0.089291672\n",
      "Train Accuracy:  0.993333   Test Accuracy:  0.9682\n",
      "Epoch: 0009 cost= 0.083972478\n",
      "Train Accuracy:  0.986667   Test Accuracy:  0.9721\n",
      "Epoch: 0010 cost= 0.083262836\n",
      "Train Accuracy:  0.986667   Test Accuracy:  0.9749\n",
      "Training Finished!!!\n"
     ]
    }
   ],
   "source": [
    "tf.reset_default_graph()\n",
    "\n",
    "# build model\n",
    "model = ImageRNN(N_STEPS, N_INPUTS, N_NEURONS, N_OUTPUTS)\n",
    "\n",
    "# training procedures (backpropagation)\n",
    "optimizer = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE)\n",
    "training_op = optimizer.minimize(model.loss)\n",
    "\n",
    "with tf.Session() as sess:\n",
    "    # initialize and run all variables\n",
    "    init = tf.global_variables_initializer()\n",
    "    sess.run(init)\n",
    "    \n",
    "    for epoch in range(N_EPHOCS):\n",
    "        avg_cost = 0. # average loss\n",
    "        total_batch = mnist.train.num_examples // BATCH_SIZE\n",
    "        for iteration in range(total_batch): # note iterations are depended on batch size (increase for faster computation)\n",
    "            X_batch, y_batch = mnist.train.next_batch(BATCH_SIZE)\n",
    "            X_batch = X_batch.reshape((-1, N_STEPS, N_INPUTS))\n",
    "            _, cost = sess.run([training_op, model.loss], feed_dict={model.X: X_batch, model.y: y_batch})\n",
    "            avg_cost += cost / total_batch\n",
    "        \n",
    "        acc_train = model.accuracy.eval(feed_dict={model.X: X_batch, model.y: y_batch})\n",
    "        acc_test = model.accuracy.eval(feed_dict={model.X: X_test, model.y: y_test})\n",
    "        \n",
    "        # Display cost, accuracy (test, train) based on display step\n",
    "        if (epoch+1) % DISPLAY_STEP == 0:\n",
    "            print(\"Epoch:\", '%04d' % (epoch+1), \"cost=\", \"{:.9f}\".format(avg_cost))\n",
    "            print(\"Train Accuracy: \", acc_train, \" \", \"Test Accuracy: \", acc_test)\n",
    "            \n",
    "    print(\"Training Finished!!!\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "---"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### References:\n",
    "- [All sorts of Text Classificaiton Deep Learning models](https://github.com/brightmart/text_classification)\n",
    "- [NTHU Machine Learning](https://nthu-datalab.github.io/ml/labs/13_Sentiment_Analysis_and_Neural_Machine_Translation/13_Sentiment_Analysis_and_Neural_Machine_Translation.html)\n",
    "- [Introduction to RNN](http://www.wildml.com/2015/09/recurrent-neural-networks-tutorial-part-1-introduction-to-rnns/)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
